22 research outputs found

    Breaking and Fixing Secure Similarity Approximations: Dealing with Adversarially Perturbed Inputs

    Get PDF
    Computing similarity between data is a fundamental problem in information retrieval and data mining. To address the relevant performance and scalability challenges, approximation methods are employed for large-scale similarity computation. A common characteristic among all privacy- preserving approximation protocols based on sketching is that the sketching is performed locally and is based on common randomness. In the semi-honest model the input to the sketching algorithm is independent of the common randomness. We, however, consider a new threat model where a party is allowed to use the common randomness to perturb her input 1) offline, and 2) before the execution of any secure protocol so as to steer the approximation result to a maliciously chosen output. We formally define perturbation attacks under this adversarial model and propose two attacks on the well-studied techniques of minhash and cosine sketching. We demonstrate the power of perturbation attacks by measuring their success on synthetic and real data. To mitigate such perturbation attacks we propose a server- aided architecture, where an additional party, the server, assists in the secure similarity approximation by handling the common randomness as private data. We revise and introduce the necessary secure protocols so as to apply minhash and cosine sketching techniques in the server-aided architecture. Our implementation demonstrates that this new design can mitigate offline perturbation attacks without sacrificing the efficiency and scalability of the reconstruction protocol

    The State of the Uniform: Attacks on Encrypted Databases Beyond the Uniform Query Distribution

    Get PDF
    Recent foundational work on leakage-based attacks on encrypted databases has broadened our understanding of what an adversary can accomplish with a standard leakage profile. Nevertheless, all known value reconstruction attacks succeed under strong assumptions that may not hold in the real world. The most prevalent assumption is that queries should be issued uniformly at random by the client. We present the first value reconstruction attacks for encrypted databases without any assumptions about the query or data distribution. Our approach uses the search pattern leakage, which exists in all known structured encryption schemes but has not been effectively utilized so far. At the core of our method lies a support size estimator, a technique that utilizes the repetition of search tokens with the same response to estimate distances between encrypted values without any assumptions about the underlying distribution. We develop distribution-agnostic reconstruction attacks for both range queries and k-nearest-neighbor (k-NN) queries based on information extracted from the search pattern leakage. Our new range attack follows a different algorithmic approach than state-of-the-art attacks, which are fine-tuned to succeed under the uniform queries. Instead, we reconstruct plaintext values under a variety of skewed query distributions and even outperform the accuracy of previous approaches under uniform query distribution. Our new k-NN attack succeeds with far fewer samples than a previously proposed attack and scales to much larger values of k. We demonstrate the effectiveness of our attacks by experimentally testing them on a wide range of query distributions and database densities, both unknown to the adversary

    Data Recovery on Encrypted Databases With k-Nearest Neighbor Query Leakage

    Get PDF
    Recent works by Kellaris et al. (CCS’16) and Lacharite et al. (SP’18) demonstrated attacks of data recovery for encrypted databases that support rich queries such as range queries. In this paper, we develop the first data recovery attacks on encrypted databases supporting one-dimensional k-nearest neighbor (k-NN) queries, which are widely used in spatial data management. Our attacks exploit a generic k-NN query leakage profile: the attacker observes the identifiers of matched records. We consider both unordered responses, where the leakage is a set, and ordered responses, where the leakage is a k-tuple ordered by distance from the query point. As a first step, we perform a theoretical feasibility study on exact reconstruction, i.e., recovery of the exact plaintext values of the encrypted database. For ordered responses, we show that exact reconstruction is feasible if the attacker has additional access to some auxiliary information that is normally not available in practice. For unordered responses, we prove that exact reconstruction is impossible due to the infinite number of valid reconstructions. As a next step, we propose practical and more realistic approximate reconstruction attacks so as to recover an approximation of the plaintext values. For ordered responses, we show that after observing enough query responses, the attacker can approximate the client’s encrypted database with considerable accuracy. For unordered responses we characterize the set of valid reconstructions as a convex polytope in a k-dimensional space and present a rigorous attack that reconstructs the plaintext database with bounded approximation error. As multidimensional spatial data can be efficiently processed by mapping it to one dimension via Hilbert curves, we demonstrate our approximate reconstruction attacks on privacy-sensitive geolocation data. Our experiments on real-world datasets show that our attacks reconstruct the plaintext values with relative error ranging from 2.9% to 0.003%

    More Practical and Secure History-Independent Hash Tables

    Get PDF
    Direct-recording electronic (DRE) voting systems have been used in several countries including United States, India, and the Netherlands to name a few. In the majority of those cases, researchers discovered several security flaws in the implementation and architecture of the voting system. A common property of the machines inspected was that the votes were stored sequentially according to the time they were cast, which allows an attacker to break the anonymity of the voters using some side-channel information. Subsequent research (Molnar et al. Oakland’06, Bethencourt et al. NDSS’07, Moran et al. ICALP’07) pointed out the connection between vote storage and history-independence, a privacy property that guarantees that the system does not reveal the sequence of operations that led to the current state. There are two flavors of history-independence. In a weakly history-independent data structure, every possible sequence of operations consistent with the current set of items is equally likely to have occurred. In a strongly history-independent data structure, items must be stored in a canonical way, i.e., for any set of items, there is only one possible memory representation. Strong history- independence implies weak history-independence but considerably constrains the design choices of the data structures. In this work, we present and analyze an efficient hash table data structure that simultaneously achieves the following properties: • It is based on the classic linear probing collision-handling scheme. • It is weakly history-independent. • It is secure against collision-timing attacks. That is, we consider adversaries that can measure the time for an update operation, but cannot observe data values, and we show that those adversaries cannot learn information about the items in the table. • All operations are significantly faster in practice (in particular, almost 2x faster for high load factors) than those of the commonly used strongly history-independent linear probing method proposed by Blelloch and Golovin (FOCS’07), which is not secure against collision-timing attacks. The first property is desirable for ease of implementation. The second property is desirable for the sake of maximizing privacy in scenarios where the memory of the hash table is exposed, such as post-election audit of DRE voting machines or direct memory access (DMA) attacks. The third property is desirable for maximizing privacy against adversaries who do not have access to memory but nevertheless are capable of accurately measuring the execution times of data structure operations. To our knowledge, our hash table construction is the first data structure that combines history-independence and protection against a form of timing attacks

    Auditable Data Structures

    Get PDF
    The classic notion of history-independence guarantees that if a data structure is ever observed, only its current contents are revealed, not the history of operations that built it. This powerful concept has applications, for example, to e-voting and data retention compliance, where data structure histories should be private. The concept of weak history-independence (WHI) assumes only a single observation will ever occur, while strong history-independence (SHI) allows for multiple observations at arbitrary times. WHI constructions tend to be fast, but provide no repeatability, while SHI constructions provide unlimited repeatability, but tend to be slow. We introduce auditable data structures, where an auditor can observe data structures at arbitrary times (as in SHI), but we relax the unrealistic restriction that data structures cannot react to observations, since in most applications of history-independence, data owners know when observations have occurred. We consider two audit scenarios—secure topology, where an auditor can observe the contents and pointers of a data structure, and secure implementation, where an auditor can observe the memory layout of a data structure. We present a generic template for auditable data structures and, as a foundation for any auditable data structure, an Auditable Memory Manager (AMM), which is an efficient memory manager that translates any auditable data structure with a secure topology into one with a secure implementation. We give a prototype implementation that provides empirical evidence that the worst-case time running times of our AMM are 45× to 8,300× faster than those of a well-known SHI memory manager. Thus, auditable data structures provide a practical way of achieving time efficiency, as in WHI, while allowing for multiple audits, as in SHI

    Serum biomarkers identify critically ill traumatic brain injury patients for MRI

    Get PDF
    corecore